DETAILED ACTION

Response to Arguments 

1. Applicant's arguments filed 10/14/08 have been fully considered but they are not
persuasive.

Applicant argues that neither Laroche nor Weare et al., nor McEachern teach 
determining whether said subsequent media program subset exhibits similarities to said 
initial media program subset (Amendment, pages 12-18). 

The examiner disagrees, since Weare et al., disclose "a media entity is received 
by the system and the data is converted from the time domain to the frequency domain. 
For each frame of data, critical band filtering is performed on the data. Once enough 
feature vectors are added to the classification chain, the classification chain is ready for 
operation. The methods also help to determine media entities that have similar or 
dissimilar as a request may indicate, melodic movement by utilizing classification chain 
techniques" (col.5, lines 15-22; col. 16, lines 36 - 65). 

Claim Rejections - 35 USC §112 

2. The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly
claiming the subject matter which the applicant regards as his invention.

3. Claim 33 recites the limitation "said repetition" in lines 2 and 3. There is
insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 103

4. The text of those sections of Title 35, U.S. Code not included in this action can 
be found in a prior Office action. 

Claims 1 - 4, 6 - 16, 18, 21 - 23, 37, and 41 - 45 are rejected under 35 U.S.C.
103(a) as being unpatentable over Weare et al., (US Patent 7,065,416) in view of 
McEachern (US Patent 5,615,302). 

Regarding claim 1, Weare et al. discloses a method for program content
identification (see col. 6, lines 22-27), said method comprising the steps of: 

for each of at least two media program subsets, performing the steps of (col.5, 
lines 15-22): 

filtering each first frequency domain representation of blocks of said media
program subset using a plurality of filters to develop a respective second frequency
domain representation of each of said blocks of said media program, said second frequency
domain representation of each of said blocks having a reduced number of frequency
coefficients with respect to said first frequency domain representation (see col.
16, lines 47, fig. 7, element 750, describing a critical band filtering step which can be
modeled as a filter bank, thus indicating that a plurality of filters exist);

grouping frequency coefficients of said second frequency domain representation 
of said blocks to form segments (see fig. 8A element 804, col. 17, lines 57-60, and col. 
16, lines 25-30, where critical band filtering forms several critical bands, interpreted by 
the examiner as groups); and selecting a plurality of said segments (see col. 18, lines 
10-15, where the peaks with the highest energies are selected);

comparing selected segments to features of stored programs to identify thereby
said media program subset ("classification of media entities"; Abstract, lines 3-7);

determining whether said subsequent media program subset exhibits similarities 
to said initial media program subset (col.5, lines 15-22; col. 16, lines 36 - 65). 

However, Weare et al., do not specifically teach that said plurality of filters have 
center frequencies logarithmically spaced apart from each other with a logarithmic 
additive factor of 1/12.

McEachern teaches that this 1/12 octave filter center frequency spacing results in
logarithmically spaced filters that are very closely centered at the frequencies of the
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to use logarithmically spaced filters as taught by McEachern in Weare
et al., because that would help better recognize the media content. 
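
For illustration only, a logarithmic additive factor of 1/12 (in base-2 logarithm) corresponds to multiplying each center frequency by 2^(1/12), so that twelve filters span one octave. The sketch below is not taken from McEachern, Weare et al., or the application; the function name and the 110 Hz starting frequency are assumptions chosen for the example.

```python
import math

def twelfth_octave_centers(f0_hz, count):
    """Return `count` center frequencies spaced 1/12 octave apart.

    Hypothetical helper: log2(f[k+1]) - log2(f[k]) equals 1/12 for every k.
    """
    return [f0_hz * 2.0 ** (k / 12.0) for k in range(count)]

# The 110 Hz starting point is an assumed value, not one from the references.
centers = twelfth_octave_centers(110.0, 25)

# Step between adjacent centers, measured in octaves (log2 of the ratio).
steps = [math.log2(b / a) for a, b in zip(centers, centers[1:])]
```

Under these assumptions, the thirteenth entry of `centers` lies one octave above the first, and every entry of `steps` is approximately 1/12.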

Regarding claim 2, Weare et al. further disclose that each grouping of frequency 
coefficients of said second frequency domain to form a segment represents blocks that 
are consecutive in time in said media program (see col. 18, lines 10-15, since the
peaks with highest energies are selected it follows that the segments may be
contiguous in time if two highest peaks are positioned consecutively).

Regarding claim 3, Weare et al. further disclose that said plurality of filters are
arranged in a group that processes a block at a time, the portion of said second
frequency domain representation produced by said group for each block forms a frame,
and wherein at least two frames are grouped to form a segment (see col. 18, where
peaks last for multiple frames, thereby having a segment of at least two frames).

Regarding claim 4, Weare et al. further disclose that said selected segments 
correspond to portions of said media program that are not contiguous in time (see col. 
18, lines 10-15, since the peaks with the highest energies are selected, it follows that 
the segments may not be contiguous if a peak that does not meet the criteria "highest" 
is positioned between two "highest" peaks). 

Regarding claim 7, Weare et al. further disclose that the segments selected in 
said selecting step are those that have largest minimum segment energy (see col. 18, 
lines 10-15). 

Regarding claim 8, Weare et al. further disclose that the segments selected in 
said selecting step are selected in accordance with prescribed constraints (see col. 18, 
line 66 - col. 19 line 2, where only selecting peaks that last for more than specified 
number of frames prevents the peaks from being too close).

Regarding claim 9, Weare et al. further disclose that the segments selected in
said selecting step are selected for portions of said media program that correspond in 
time to prescribed search windows that are separated by gaps (see col. 19, lines 5-10 
where frames correspond to search windows, and the frames are individual thus, there 
is a separation by gaps). 

Regarding claim 10, Weare et al. further disclose that the segments selected in 
said selecting step are those that result in the selected segments having a maximum 
entropy over the selected segments (see col. 18, lines 12-15, where the most energetic 
peaks are chosen, thus choosing the most entropic peaks). 

Regarding claims 11-13, Weare et al. further disclose that the step of 
normalizing said frequency coefficients in said second frequency domain representation 
after performing said grouping step, said normalization being performed on a per- 
segment basis; wherein said normalization includes performing at least a preceding- 
time normalization; an L2 normalization ("normalizing the sum"; see col. 16, lines 3-6). 
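
As a point of reference for the L2 normalization recited in claim 13, the following is a minimal sketch of scaling one segment's coefficients to unit Euclidean norm. It does not reproduce the method of Weare et al. or of the application; the function name and the handling of an all-zero segment are assumptions.

```python
import math

def l2_normalize(segment):
    """Scale a segment's coefficients so their Euclidean (L2) norm is 1."""
    norm = math.sqrt(sum(c * c for c in segment))
    if norm == 0.0:
        # Assumed convention: an all-zero segment is returned unchanged.
        return list(segment)
    return [c / norm for c in segment]

normalized = l2_normalize([3.0, 4.0])
unit_norm = math.sqrt(sum(c * c for c in normalized))
```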

Regarding claim 14, Weare et al. further disclose that the step of storing said 
selected segments in a database in association with an identifier of said media program 
(see col. 7, lines 59-65, where music is stored in a database and for generating play 
lists thus an identifier must be associated with the stored data).

Regarding claim 15, Weare et al. further disclose that the step of storing in said
database information indicating timing of said selected segments (see col. 9, lines 16-21,
where classifying the tempo in the database indicates timing of media segment).

Regarding claim 16, Weare et al. further disclose that said first frequency domain 
representation of blocks of said media program is developed by the steps of: digitizing 
an audio representation of said media program to be stored in said database (see col. 
16, lines 41-44); dividing the digitized audio representation into blocks of a prescribed 
number of samples (see col. 16, lines 41-44, where the audio representation is divided 
into frames); smoothing said blocks using a filter (see col. 16, lines 45-47); and 

converting said smoothed blocks into the frequency domain, wherein said 
smoothed blocks are represented by frequency coefficients (see col. 16, lines 39- 41). 

Regarding claim 18, Weare et al. further disclose that each of said smoothed 
blocks are converted into the frequency domain in said converting step using a Fast 
Fourier Transform (FFT) (see col. 16, lines 39-41 and col. 23, lines 52-54). 

Regarding claims 21 - 23, and 37, Weare et al. discloses identification of content 
identification (see col. 6, lines 22-27), comprising: 

for each of at least two media program subsets, performing the steps of (col.5, 
lines 15-22):

filtering each first frequency domain representation of blocks of said media
program subset using a plurality of filters to develop a respective second frequency
domain representation of each of said blocks of said media program, said second frequency
domain representation of each of said blocks having a reduced number of frequency
coefficients with respect to said first frequency domain representation (see col.
16, lines 47, fig. 7, element 750, describing a critical band filtering step which can be
modeled as a filter bank, thus indicating that a plurality of filters exist);

grouping frequency coefficients of said second frequency domain representation 
of said blocks to form segments (see fig. 8A element 804, col. 17, lines 57-60, and col. 
16, lines 25-30, where critical band filtering forms several critical bands, interpreted by 
the examiner as groups); and selecting a plurality of said segments (see col. 18, lines 
10-15, where the peaks with the highest energies are selected); 

However, Weare et al., do not specifically teach that said plurality of filters have
center frequencies logarithmically spaced apart from each other with a logarithmic
additive factor of 1/12.

McEachern teaches that this 1/12 octave filter center frequency spacing results in
logarithmically spaced filters that are very closely centered at the frequencies of the
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to use logarithmically spaced filters as taught by McEachern in Weare
et al., because that would help better recognize the media content.

Weare et al., in view of McEachern do not specifically teach storing at least 30
minutes worth of segments. However, Weare et al., teach storage collection of
media entities, such as media entities that are audio files, or have portions that are
audio files (col.5, lines 13-18). One having ordinary skill in the art at the time the
invention was made would have found it obvious to store at least 30 minutes worth of
segments in Weare et al., in view of McEachern, because that would help determine
media entities that have similar or dissimilar melodic movement (col.5, lines 18 - 22).

As per claims 41 - 45, Weare et al., further disclose at least two of said media 
subsets are associated with the same media program; at least two of said media 
subsets are associated with different media program ("media entities that are audio files 
or have portions that are audio files"; Abstract).

5. Claims 24 - 30, 32 - 36 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Laroche (US Patent 6,453,252) in view of McEachern (US Patent
5,615,302), and further in view of Weare et al., (US Patent 7,065,416).

Regarding claims 24, 34, 35, Laroche discloses identification of content 
identification (see col. 6, lines 22-27), comprising: 

for each of at least two media program subsets, performing the steps of (col.5, 
lines 15-22): 

filtering each first frequency domain representation of blocks of said media 
program subset using a plurality of filters to develop a respective second frequency 



Application/Control Number: 10/629,486 Page 10 

Art Unit: 2626 

domain representation of each of said blocks of said media program, said second frequency
domain representation of each of said blocks having a reduced number of frequency
coefficients with respect to said first frequency domain representation (see fig.
1 and col. 2, lines 36-48);

grouping frequency coefficients of said second frequency domain representation 
of said blocks to form segments(see col. 2, lines 46-48); and searching a database for 
substantially matching segments, said database having stored therein segments of 
media programs and respective corresponding program identifiers (see col. 4, lines 33- 
34). 

However, Laroche et al., do not specifically teach that said plurality of filters have 
center frequencies logarithmically spaced apart from each other with a logarithmic 
additive factor of 1/12. 

McEachern teaches that this 1/12 octave filter center frequency spacing results in
logarithmically spaced filters that are very closely centered at the frequencies of the
linearly spaced harmonics (col. 12, line 66 - col. 13, line 2).

Therefore, it would have been obvious to one of ordinary skill in the art at the
time the invention was made to use logarithmically spaced filters as taught by McEachern in
Laroche, because that would help better recognize the media content.

However Laroche in view of McEachern do not specifically teach determining 
whether said subsequent media program subset exhibits similarities to said initial media 
program subset.

Weare et al., teach "a media entity is received by the system and the data is
converted from the time domain to the frequency domain. For each frame of data,
critical band filtering is performed on the data. Once enough feature vectors are added
to the classification chain, the classification chain is ready for operation. The methods
also help to determine media entities that have similar or dissimilar as a request
may indicate, melodic movement by utilizing classification chain techniques" (col. 5, lines
15-22; col. 16, lines 36-65).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to determine similarities between media entities as taught 
by Weare et al., in Laroche in view of McEachern, because that would help classify 
media entities (col.5, lines 7-12). 

Regarding claim 25, Laroche further discloses that the step of indicating that said 
media program cannot be identified when substantially matching segments are not 
found in said database in said searching step (see col. 4, lines 38-42, where the value 
indicates if there is a true match or not). 

Regarding claim 26, Laroche further discloses that said data base includes 
information indicating timing of segments of each respective media program identified 
therein (see col. 4, line 64- col. 5, line 5), and wherein a match may be found in said 
searching step only when the timing of said segments produced in said grouping step 
substantially matches the timing of said segments stored in said database (see col. 5, 
lines 5-10, where fingerprints taken at other maxima will not fit, thus the match will only
be found when the timing segments match). 

Regarding claim 27, Laroche further discloses that said matching between 
segments is based on the Euclidean distances between segments (see col. 4, lines 34- 
38). 

Regarding claim 28, Laroche further discloses that the step of identifying said 
media program as being the media program indicated by the identifier stored in said 
database having a best matching score when substantially matching segments are 
found in said database in said searching step (see col. 4, lines 38-42, where the match 
is determined by the smallest value, where larger values may match substantially, but 
are not indicated as the best match). 
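
The matching behavior discussed for claims 25, 27, and 28 (Euclidean distance between segments, a best match reported by identifier, and no identification when nothing matches substantially) can be illustrated with the sketch below. It is not Laroche's implementation; the function names, the dictionary-based database, the sample vectors, and the 0.5 threshold are all assumptions made for the example.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length coefficient vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(query, database, threshold):
    """Return (identifier, distance) for the closest stored segment.

    Returns None when no stored segment lies within `threshold`, which
    models the "cannot be identified" outcome. Hypothetical helper.
    """
    best = None
    for identifier, stored in database.items():
        distance = euclidean_distance(query, stored)
        if best is None or distance < best[1]:
            best = (identifier, distance)
    if best is None or best[1] > threshold:
        return None
    return best

# Assumed toy database mapping program identifiers to stored segments.
database = {
    "program_a": [0.0, 1.0, 0.0],
    "program_b": [1.0, 0.0, 0.0],
}

match = best_match([0.9, 0.1, 0.0], database, threshold=0.5)
no_match = best_match([5.0, 5.0, 5.0], database, threshold=0.5)
```

Here the smallest distance selects the best match, while a larger distance that still falls under the threshold would count as a substantial match without being reported as the best one.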

Regarding claim 29, Laroche further discloses that the step of determining a 
speed differential between said media program and a media program identified in said 
identifying step (see col. 3, lines 64-67, where two signals can differ by a slowly time- 
varying function). 

Regarding claim 32, Laroche further discloses that said determining step is 
based on an overlap score (see claim 6, where an identifying method is claimed based 
on a segment divided into overlapping frames).

Regarding claim 36, Laroche further discloses that said first frequency domain
representation of said media program comprises a plurality of blocks of coefficients 
corresponding to respective time domain sections of said media program (see col. 2, 
lines 36-40) and said second frequency domain representation of said media program 
comprises a plurality of blocks of coefficients corresponding to respective time domain 
sections of said media program (see col. 2, lines 42-48). 

Regarding claims 30, and 33, Laroche in view of McEachern, and further in view
of Weare et al., do not disclose wherein said matching score for a program P_i is
determined by [equation omitted].

However, Weare et al., teach nearest neighbor and/or other matching
algorithms may be utilized to locate songs that are similar... a confidence level for song
classification may also be returned (col. 8, lines 1 - 10). One having ordinary skill in the
art at the time the invention was made would have found it obvious to use a matching score
in Laroche in view of McEachern, and further in view of Weare et al., because that
would help classify media entities (col.5, lines 7-12).

6. Claims 5, 6, 17, and 19 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Weare et al., (US Patent 7,065,416) in view of McEachern (US 
Patent 5,615,302), and further in view of Rahim et al., (US Patent 7,181,399).

Regarding claims 5, 6, 17, and 19, Weare et al. in view of McEachern do not
specifically disclose wherein said plurality of filters includes at least a set of triangular 
filters; said smoothing step is a Hamming window filter; smoothed blocks are converted 
into the frequency domain in said converting step using a Discrete Cosine Transform 
(DCT). 

Rahim et al., teach each frame is Hamming windowed, Fourier transformed and
then passed through a set of twenty-two triangular band-pass filters. Twelve mel
cepstral coefficients are computed by applying the inverse discrete cosine transform
(col.4, lines 15-22).

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to use triangular filters as taught by Rahim et al., in Weare 
et al. in view of McEachern, because that would improve the degree of smoothing of 
different blocks. 
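
For illustration only, the two elements cited from Rahim et al. (a Hamming window applied to each frame and triangular band-pass filters applied to the spectrum) can be sketched as follows. This is not code from any cited reference; the function names, the 8-sample frame length, and the bin indices are assumptions.

```python
import math

def hamming_window(n):
    """Hamming window of length n (n >= 2): 0.54 - 0.46*cos(2*pi*i/(n-1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def triangular_filter(num_bins, left, center, right):
    """Weights rising linearly from 0 at `left` to 1 at `center`, then
    falling back to 0 at `right`. Requires left < center < right < num_bins."""
    weights = [0.0] * num_bins
    for k in range(left, right + 1):
        if k <= center:
            weights[k] = (k - left) / (center - left)
        else:
            weights[k] = (right - k) / (right - center)
    return weights

def apply_filter(magnitudes, weights):
    """Collapse a magnitude spectrum to one coefficient for this filter."""
    return sum(m * w for m, w in zip(magnitudes, weights))

# Assumed toy sizes: an 8-sample frame and an 8-bin magnitude spectrum.
window = hamming_window(8)
tri = triangular_filter(num_bins=8, left=1, center=3, right=5)
band_energy = apply_filter([1.0] * 8, tri)
```

Each filter yields a single output value, so a bank of such filters produces fewer coefficients than the spectrum it is applied to.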

Conclusion 

7. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the
shortened statutory period will expire on the date the advisory action is mailed, and any
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to LEONARD SAINT CYR whose telephone number is
(571) 272-4247. The examiner can normally be reached on Mon-Friday.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on (571) 272-7602. The fax phone 
number for the organization where this application or proceeding is assigned is
(571) 273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 
LS 

12/18/08 